Advances in Bayesian inference and stable optimization for large-scale machine learning problems
A core task in machine learning, and the topic of this thesis, is developing faster and more accurate methods of posterior inference in probabilistic models. The thesis has two components. The first explores using deterministic methods to improve the efficiency of Markov Chain Monte Carlo (MCMC) algorithms. We propose new MCMC algorithms that can use deterministic methods as a “prior” to bias MCMC proposals toward areas of high posterior density, leading to highly efficient sampling. In Chapter 2 we develop such methods for continuous distributions, and in Chapter 3 for binary distributions. The resulting methods consistently outperform existing state-of-the-art sampling techniques, sometimes by several orders of magnitude. Chapter 4 uses ideas similar to those in Chapters 2 and 3, but in the context of modeling the performance of left-handed players in one-on-one interactive sports.
The second part of this thesis explores the use of stable stochastic gradient descent (SGD) methods for computing a maximum a posteriori (MAP) estimate in large-scale machine learning problems. In Chapter 5 we propose two such methods for softmax regression. The first is an implementation of Implicit SGD (ISGD), a stable but difficult-to-implement SGD method, and the second is a new SGD method specifically designed for optimizing a double-sum formulation of the softmax. Both methods comprehensively outperform the previous state-of-the-art on seven real-world datasets. Inspired by the success of ISGD on the softmax, we investigate its application to neural networks in Chapter 6. In this chapter we present a novel layer-wise approximation of ISGD that has efficiently computable updates. Experiments show that the resulting method is more robust to high learning rates and generally outperforms standard backpropagation on a variety of tasks.
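To illustrate why an implicit update can be more robust to high learning rates, the following is a minimal sketch on a toy problem, not the thesis's softmax or neural-network method. It assumes a squared loss on noiseless linear data, where the implicit equation has a well-known closed form; the function names, the learning rate, and the data are hypothetical choices made for illustration only.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_sgd_step(theta, x, y, lr):
    # Standard SGD on 0.5 * (y - x.theta)^2: gradient taken at the current theta.
    residual = y - dot(theta, x)
    return [t + lr * residual * xi for t, xi in zip(theta, x)]


def implicit_sgd_step(theta, x, y, lr):
    # Implicit SGD: gradient taken at the *updated* theta. For squared loss,
    # solving theta' = theta + lr * (y - x.theta') * x gives the explicit step
    # shrunk by 1 / (1 + lr * ||x||^2), so the step can never overshoot.
    residual = y - dot(theta, x)
    shrink = 1.0 / (1.0 + lr * dot(x, x))
    return [t + lr * shrink * residual * xi for t, xi in zip(theta, x)]


def compare_on_noiseless_regression(lr=10.0, steps=200, seed=0):
    # Hypothetical toy setup: 2-D inputs, a fixed true parameter, no noise.
    rng = random.Random(seed)
    true_theta = [2.0, -1.0]
    explicit = [0.0, 0.0]
    implicit = [0.0, 0.0]
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in true_theta]
        y = dot(true_theta, x)
        explicit = explicit_sgd_step(explicit, x, y, lr)
        implicit = implicit_sgd_step(implicit, x, y, lr)

    def max_abs_error(theta):
        return max(abs(t - s) for t, s in zip(theta, true_theta))

    return max_abs_error(explicit), max_abs_error(implicit)


if __name__ == "__main__":
    explicit_err, implicit_err = compare_on_noiseless_regression()
    print("explicit SGD error:", explicit_err)
    print("implicit SGD error:", implicit_err)
```

With a deliberately large learning rate, the explicit iterate in this toy setting overshoots and blows up, while the implicit iterate contracts toward the true parameter. For losses such as the softmax the implicit equation generally has no such simple closed form, which is consistent with the abstract's description of ISGD as difficult to implement.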
Self-supervised object detection from audio-visual correspondence
We tackle the problem of learning object detectors without supervision.
Unlike weakly-supervised object detection, we do not assume
image-level class labels. Instead, we extract a supervisory signal from
audio-visual data, using the audio component to "teach" the object detector.
While this problem is related to sound source localisation, it is considerably
harder because the detector must classify the objects by type, enumerate each
instance of the object, and do so even when the object is silent. We tackle
this problem by first designing a self-supervised framework with a contrastive
objective that jointly learns to classify and localise objects. Then, without
using any supervision, we simply use these self-supervised labels and boxes to
train an image-based object detector. With this, we outperform previous
unsupervised and weakly-supervised detectors for the task of object detection
and sound source localisation. We also show that we can align this detector to
ground-truth classes with as little as one label per pseudo-class, and show how
our method can learn to detect generic objects that go beyond instruments, such
as airplanes and cats.
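The abstract does not specify the form of its contrastive objective. As a hedged illustration only, the sketch below uses an InfoNCE-style loss, a common choice that is assumed here rather than taken from the paper, in which the audio and visual embeddings of the same clip form the positive pair and the other clips in the batch act as negatives. The function name `info_nce`, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(audio_embs, visual_embs, temperature=0.1):
    """Average audio-to-visual InfoNCE loss; (audio_i, visual_i) is the positive pair."""
    audio = [normalize(a) for a in audio_embs]
    visual = [normalize(v) for v in visual_embs]
    total = 0.0
    for i, a in enumerate(audio):
        # Cosine similarity of this audio clip to every visual candidate.
        logits = [dot(a, v) / temperature for v in visual]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the target.
        total += log_denom - logits[i]
    return total / len(audio)


if __name__ == "__main__":
    audio = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", info_nce(audio, matched))
    print("swapped pairs:", info_nce(audio, swapped))
```

The loss is low when each audio embedding is closest to its own visual embedding and high when the pairing is wrong, which is the signal a contrastive audio-visual objective relies on. This sketch covers only the audio-to-visual direction and does not attempt the joint classification and localisation described in the abstract.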
TRY plant trait database - enhanced coverage and open access
Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.
Structural and micro-anatomical changes in vertebrae associated with idiopathic-type spinal curvature in the curveback guppy model
Background: The curveback lineage of guppy is characterized by heritable idiopathic-type spinal curvature that develops during growth. Prior work has revealed several important developmental similarities to the human idiopathic scoliosis (IS) syndrome. In this study we investigate structural and histological aspects of the vertebrae that are associated with spinal curvature in the curveback guppy and test for sexual dimorphism that might explain a female bias for severe curve magnitudes in the population.
Methods: Vertebrae were studied from whole-mount skeletal specimens of curved and non-curved adult males and females. A series of ratios were used to characterize structural aspects of each vertebra. A three-way analysis of variance tested for effects of sex, curvature, vertebral position along the spine, and all 2-way interactions (i.e., sex and curvature, sex and vertebra position, and vertebra position and curvature). Histological analyses were used to characterize microarchitectural changes in affected vertebrae and the intervertebral region.
Results: In curveback, vertebrae that are associated with curvature demonstrate asymmetric shape distortion, migration of the intervertebral ligament, and vertebral thickening on the concave side of curvature. There is sexual dimorphism among curved individuals such that for several vertebrae, females have more slender vertebrae than do males. Also, in the region of the spine where lordosis typically occurs, curved and non-curved females have a reduced width at the middle of their vertebrae, relative to males.
Conclusions: Based on similarities to human spinal curvatures and to animals with induced curves, the concave-convex biases described in the guppy suggest that there is a mechanical component to curve pathogenesis in curveback. Because idiopathic-type curvature in curveback is primarily a sagittal deformity, it is structurally more similar to Scheuermann kyphosis than IS. Anatomical differences between teleosts and humans make direct biomechanical comparisons difficult. However, study of basic biological systems involved in idiopathic-type spinal curvature in curveback may provide insight into the relationship between a predisposing aetiology, growth, and biomechanics. Further work is needed to clarify whether observed sex differences in vertebral characteristics are related to the female bias for severe curves that is observed in the population.
TRY plant trait database - enhanced coverage and open access
This article has 730 authors, of whom I have only listed the lead author and myself as a representative of the University of Helsinki.
Peer reviewed